LONG-TERM GOALS

The  Macaulay  Library  is  the  world’s  largest  archive  of  animal  sounds  and  has  been  selected  by  the 
Office  of  Naval  Research  as  a  major  repository  for  the  deposition,  digital  archival,  review,  and 
retrieval  of  the  many  recordings  of  marine  animals  made  over  the  last  half-century.  Archived  marine 
recordings  pose  challenging  retrieval  problems  given  the  typically  long  intervals  of  silence  between 
animal  sounds  and  the  multiplicity  of  species  detectable  in  any  given  recording.  Our  long-term  goal  is 
to  design  tools  and  interactive  interfaces  that  streamline  and  enhance  the  search  and  retrieval  functions 
of  the  marine  archive,  yet  provide  maximal  access  to  the  associated  metadata. 

OBJECTIVES 

One  specific  objective  of  this  project  is  to  design  software  that  will  permit  remote  experts  to  annotate 
the  content  of  long  recordings  archived  at  the  Macaulay  Library  through  their  web  browsers. 
Annotations  will  permit  subsequent  searches  of  the  archive  database  to  retrieve  not  only  suitable 
recordings,  but  also  those  parts  of  a  recording  meeting  the  search  criteria.  A  second  main  objective  is 
to  define  and  extract  a  set  of  acoustic  features  from  all  archived  marine  recordings  that  can  be  used  in 
subsequent  search  and  retrieval  tasks.  A  third  task  is  to  refine  the  geographical  location  data  for  each 
recording  in  preparation  for  collaborative  linking  with  OBIS-SEAMAP.  The  combined  capabilities 
will  be  unique  to  this  sound  collection,  and  will  greatly  enhance  the  accessibility  and  the  utility  of  the 
archive  to  scientists,  students,  educators,  military  personnel,  the  media,  and  the  general  public. 

APPROACH AND WORKPLAN


Specific  tasks  to  meet  the  first  of  our  three  objectives  included  creating  a)  browser-based  software  for 
visualizing  and  playing  back  digitized  sounds  stored  in  the  archive;  b)  mouse-driven  tools  for 
identifying specific segments within the visual image of the sound; c) pull-down menus that allow the
annotator to assign standardized metadata terms for annotation; d) suitable metadata structures for
storage of the annotations, the relevant segment delimitation points, and linkages to other relevant
metadata fields; e) search engines that support the invocation of annotation terms during searches
along  with  other  standard  criteria;  and  f)  retrieval  tools  that  identify  relevant  parts  within  archived 
recordings.  The  software  was  designed  and  implemented  by  the  Macaulay  Library  Information 
Technology  team  in  collaboration  with  Totally  Hip  Technologies.  The  data  model  was  overseen  by  Mr. 
Tim  Levatich  of  the  Cornell  Lab  of  Ornithology  Information  Services  unit  with  extensive  design  ideas 
and  testing  by  the  Macaulay  Library  archival  and  distribution  units. 
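The relationship between annotated segments and segment-level retrieval can be sketched as follows. This is a minimal illustration only, not the Macaulay Library's actual data model: the class names, field names, catalog identifiers, and annotation terms are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """One annotated segment of a recording (field names are illustrative)."""
    start_s: float      # segment start, in seconds from the start of the recording
    end_s: float        # segment end, in seconds from the start of the recording
    terms: frozenset    # standardized metadata terms chosen by the annotator


@dataclass
class Recording:
    catalog_id: str
    duration_s: float
    annotations: list = field(default_factory=list)


def find_segments(recordings, required_terms):
    """Return (catalog_id, start_s, end_s) for each segment carrying all required terms."""
    wanted = frozenset(required_terms)
    hits = []
    for rec in recordings:
        for ann in rec.annotations:
            if wanted <= ann.terms:  # every requested term is present on the segment
                hits.append((rec.catalog_id, ann.start_s, ann.end_s))
    return hits


# Invented example data, for illustration only.
archive = [
    Recording("REC-0001", 3600.0, [
        Annotation(125.0, 140.5, frozenset({"humpback whale", "song"})),
        Annotation(2210.0, 2214.0, frozenset({"bottlenose dolphin", "whistle"})),
    ]),
    Recording("REC-0002", 1800.0, [
        Annotation(30.0, 95.0, frozenset({"humpback whale", "social call"})),
    ]),
]

song_hits = find_segments(archive, ["humpback whale", "song"])
print(song_hits)
```

A search of this kind returns the delimitation points of each matching segment, so a player could jump directly to the relevant part of a long recording rather than to its beginning.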

To  achieve  the  second  objective,  we  sought  to:  a)  consult  with  the  marine  community  on  desired 
measures;  b)  create  algorithms  for  these  measures;  c)  implement  these  algorithms  so  that  they  can  be 
applied  directly  to  annotated  segments  in  the  archive;  d)  provide  suitable  metadata  structures  to  store 
the extracted feature data and link them to the other fields pertinent to any recording; and e) test and
refine  the  measures  with  input  from  the  marine  community.  This  work  was  undertaken  by  Dr.  David 
Mellinger  of  Oregon  State  University  and  Macaulay  Library  director  Dr.  Jack  Bradbury  in  consultation 
with  Dr.  Kurt  Fristrup  (formerly  of  the  Cornell  Lab  of  Ornithology),  and  Dr.  Chris  Clark  and  his  staff 
of the Bioacoustics Research Program (Cornell Lab of Ornithology).

The  third  objective  required  that  we:  a)  define  and  classify  the  types  of  location  information  currently 
provided;  b)  assemble  gazetteers  and  data  sources  for  replacing  names  and  descriptions  with  lat/long 
data; c) provide tools for interpolation and error circle definition; d) generate a database of known
locations and their coordinates to accelerate translations; and e) design a data model for easy access and
federation with OBIS-SEAMAP. This work was overseen by Audio Curator Greg Budney at the
Macaulay  Library,  undertaken  by  the  Macaulay  Library  IT  staff,  and  tested  extensively  by  the 
Macaulay  Library  archival  and  distribution  teams. 
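One way to picture the location tasks above (a cache of known locations, interpolation, and error circles) is the sketch below. It is an assumption-laden illustration, not the tools actually built: the place names, coordinates, and the rule used to size the error circle are hypothetical, and the simple averaging of coordinates is only reasonable for nearby points that do not straddle the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical cache of already-resolved names: name -> (lat, lon, error radius in km).
known_locations = {
    "site a": (0.0, 0.0, 0.0),
    "site b": (0.0, 2.0, 0.0),
}


def resolve(name):
    """Look up a place name in the cache; returns None when it is not yet known."""
    return known_locations.get(name.strip().lower())


def between(name_a, name_b):
    """Estimate 'between A and B' as a midpoint plus an error circle covering both ends."""
    a, b = resolve(name_a), resolve(name_b)
    if a is None or b is None:
        return None
    # Plain coordinate averaging: an approximation suitable only for nearby points.
    mid_lat = (a[0] + b[0]) / 2
    mid_lon = (a[1] + b[1]) / 2
    # Assumed rule: the circle must reach each endpoint plus that endpoint's own error.
    radius_km = max(
        haversine_km(mid_lat, mid_lon, a[0], a[1]) + a[2],
        haversine_km(mid_lat, mid_lon, b[0], b[1]) + b[2],
    )
    return (mid_lat, mid_lon, radius_km)


estimate = between("Site A", "Site B")
# Store the result so the same description translates instantly next time.
known_locations["between site a and site b"] = estimate
print(estimate)
```

Caching each resolved description is what lets a database of known locations accelerate later translations: a repeated place description becomes a dictionary lookup instead of a new gazetteer search.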

WORK  COMPLETED 

To date, the project has completed: a) adaptation of the metadata model for the Macaulay Library that
would  support  annotation  and  retrieval  of  metadata  assigned  to  specific  segments  within  an  archived 
recording;  b)  design  and  implementation  of  the  online  player  and  visualization  tool  required  for 
demarcation  of  segments  within  recordings  by  annotators;  c)  selection  of  30  candidate  acoustic 
features,  their  implementation  in  Matlab,  and  their  posting,  along  with  documentation,  on  a  forum 
website  to  promote  testing  and  input  from  the  marine  bioacoustics  community;  d)  the  design  and 
testing  of  new  geographical  tools  for  conversion  of  location  data  into  coordinates;  and  e)  design  of  data 
exchange  and  federation  tools  that  will  facilitate  linkage  of  the  marine  archive  to  other  databases. 

RESULTS 

a) Data model design and consequences: Annotator assignment of metadata to segments within
recordings requires a clearly defined data model that works at all levels. Some fields assigned to a
recording can be inherited directly by “daughter” segments, but other fields either cannot be assigned to
a whole recording or have existing assignments that refer to general trends and not to every specific segment.
Designing software that could make these assignments intelligently has required a great deal
of thought and discussion. There were also subtle database structure issues that needed to be
resolved (e.g., should daughter segments be treated as separate assets or not?). In addition, an entirely
new  ontology  for  behavior  was  designed  and  evaluated  by  the  international  animal  behavior 
community.  The  initial  behavior  model  has  recently  been  enhanced  with  a  simplified  version  that  will 
allow  easier  data  entry  by  non-expert  internal  staff  and  more  meaningful  search  results  by  standard 
users.  By  adopting  this  dual  strategy,  we  preserve  the  original  specifications  and  data,  while  providing 
an  easier  searching  experience  to  all  users.  We  also  recently  added  better  tools  to  manage  some  of  the 
more  complex  data,  such  as  taxonomy.  These  tools  provide  the  means  for  archivists  to  correct  and 
otherwise  modify  taxonomy  and  naming  in  a  logical,  intuitive  fashion;  derivations  of  these  advances 
will  prove  very  beneficial  to  the  public  users  as  well.  An  online  graphic  user  interface  was  designed  for 
metadata  entry  by  Macaulay  archivists  that  limits  entries  to  allowed  fields  and  field  values.  This  same 
interface  is  used  to  channel  annotator  input  within  boundaries  set  by  the  Macaulay  data  model. 
Considerable  time  was  spent  working  with  the  archivists  on  their  input  tools  before  we  could  begin 
working  on  an  online  annotator  version.  The  output  (search)  side  was  equally  tricky,  although 
sufficient  forethought  while  designing  the  input  side  has  avoided  a  number  of  potential  pitfalls  and 
problems.  A  new  summary  results  page  was  recently  implemented  that  accesses  located  assets  more 
efficiently  and  simply. 
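The inheritance rule described in this section, where some recording-level fields pass to daughter segments and others do not, can be sketched roughly as follows. The split between inheritable and non-inheritable fields, and every field name and value, is an assumption made for illustration; the report does not list the actual fields.

```python
# Assumed split: recording-level fields that a daughter segment may inherit.
INHERITABLE = {"recordist", "date", "location"}


class Recording:
    def __init__(self, catalog_id, **fields):
        self.catalog_id = catalog_id
        self.fields = fields


class Segment:
    """A daughter segment that resolves fields against its parent recording."""

    def __init__(self, parent, start_s, end_s, **own_fields):
        if not 0 <= start_s < end_s:
            raise ValueError("segment bounds must satisfy 0 <= start < end")
        self.parent = parent
        self.start_s = start_s
        self.end_s = end_s
        self.own_fields = own_fields

    def get(self, name):
        # Values assigned directly to the segment always win.
        if name in self.own_fields:
            return self.own_fields[name]
        # Only fields marked inheritable fall back to the parent recording.
        if name in INHERITABLE:
            return self.parent.fields.get(name)
        # Anything else (e.g. a recording-level "general trend") is not inherited.
        return None


rec = Recording("REC-0001", recordist="A. Recordist", location="Site A",
                species="mixed species (general trend)")
seg1 = Segment(rec, 125.0, 140.5, species="humpback whale", behavior="song")
seg2 = Segment(rec, 300.0, 310.0)

print(seg1.get("location"), seg1.get("species"), seg2.get("species"))
```

In this sketch the recording-level species entry describes a general trend, so an unannotated segment reports no species at all rather than silently inheriting a value that may not apply to it.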

b)  Design  and  implementation  of  annotations  player:  The  online  player  and  visualization  tool  was 
made available online in beta-testing mode in April 2005. Input from users was very positive and
resulted in numerous refinements and improvements. Among these were the addition of a power
spectrum view (complementing the waveform and spectrogram views already implemented), saveable
and  customizable  settings,  the  ability  to  open  multiple  windows  to  compare  an  archived  specimen  to 
one  on  the  user’s  desktop,  and  faster  loading  and  display  procedures.  We  also  resolved  several  bugs 
present  in  specific  platforms.  In  the  last  year,  the  player  was  adapted  to  accept  segment  demarcation  by 
an  expert  annotator,  assign  non-inherited  metadata  within  the  overall  Macaulay  Library  model  using  an 
interface  similar  to  that  used  by  Macaulay  archivists,  and  indicate  locations  of  annotated  segments  in 
recordings  retrieved  in  a  search.  The  visualization  player  was  awarded  second  prize  for  “Interactive 
Multimedia”  in  the  Fall  2006  NSF/AAAS  “Science  and  Engineering  Visualization  Challenge”  contest. 
Details  on  this  prize  are  posted  at: 

http://www.eotepic.org/modules.php?op=modload&name=News&file=article&sid=682. 

c)  Feature  extraction  tools:  Based  on  prior  published  research  by  the  marine  community,  a  workshop 
held  at  the  Cornell  Lab  of  Ornithology,  and  detailed  input  from  NOPP  partners,  30  candidate  acoustic 
features  for  future  search  and  retrieval  within  the  archive  were  identified  and  implemented  as  Matlab 
routines. These are currently posted on a website (http://mlsource.ornith.cornell.edu/ethodata/features/)
where they can be retrieved for testing, with feedback provided through an online forum. The relevant
software  for  extraction  of  these  features  from  annotated  segments  is  largely  complete  but  will  not  be 
implemented  until  sufficient  time  has  been  provided  for  possible  substitutions  and  refinements 
provided  by  the  marine  bioacoustics  community. 
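To make the idea of extracting acoustic features from an annotated segment concrete, here is a small sketch in Python. The three measures shown (duration, RMS amplitude, and peak frequency) are generic examples chosen for illustration; they are not claimed to be among the 30 candidate features, and the function names are hypothetical. The actual routines were written in Matlab.

```python
import cmath
import math


def segment_samples(samples, sample_rate, start_s, end_s):
    """Cut the annotated segment out of the full recording."""
    return samples[round(start_s * sample_rate):round(end_s * sample_rate)]


def rms_amplitude(samples):
    """Root-mean-square amplitude of the segment."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def peak_frequency_hz(samples, sample_rate):
    """Frequency of the strongest bin of a naive DFT (fine for short segments only)."""
    n = len(samples)
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


def extract_features(samples, sample_rate, start_s, end_s):
    seg = segment_samples(samples, sample_rate, start_s, end_s)
    return {
        "duration_s": len(seg) / sample_rate,
        "rms_amplitude": rms_amplitude(seg),
        "peak_frequency_hz": peak_frequency_hz(seg, sample_rate),
    }


# Synthetic test signal: silence, then a 0.05 s tone at 440 Hz, then silence.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * i / SAMPLE_RATE) for i in range(400)]
recording = [0.0] * 800 + tone + [0.0] * 800

features = extract_features(recording, SAMPLE_RATE, 0.10, 0.15)
print(features)
```

Restricting the computation to the annotated segment matters for marine recordings in particular, because long silent intervals would otherwise dominate any measure taken over the whole file.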

d) Annotator registration and security: At least at this stage, the Macaulay Library intends to
allow only authorized and registered experts to annotate segments of archived recordings. This requires an
online system for registering approved annotators and providing them with password access to the
annotation tools. Registration is a general task that is required for several other classes of Macaulay
Library users. At the moment, anyone wishing to download archived sounds must pay a fee. Fees are
graduated, with commercial users paying the most and academic or nonprofit institutions less. Once a
fee is paid, the client will be allowed registered access for the selected downloads. In addition, we
intend to allow registered researchers and their staff free registration for a limited number of
downloads per month. We are thus working on a general protocol for handling registrations, checking
authenticity, and monitoring downloads that can handle each class of registered users. Until this
protocol is complete, we cannot deploy the otherwise completed annotation tools. We hope to
complete this process in the next 6-8 months. We are also holding deployment of the annotation tools
while we discuss possible collaborations with producers of commercial sound analysis and annotation
software. Researchers wish to download annotated sound files from the Macaulay archive directly into
their analysis software with annotations intact. Similarly, accepting sound files annotated in commercial
software packages would greatly increase the rate at which annotated files can be acquired by the archive.
Some form of collaborative effort between the Macaulay Library and these software developers appears to be in
everyone’s interest and is currently under discussion.

e) Usage: Since the newer versions of the online tools became available this summer, usage of the
Macaulay  Library  website  has  increased  exponentially.  This  applies  both  to  the  existing  terrestrial 
collection and the marine collection. Graphs of user demand for recordings are displayed below:

Figure 1. Rapid increase in usage of terrestrial (left) and marine (right) recordings since July 2006

IMPACT AND IMPLICATIONS

Economic Development

The  new  annotation,  visualization,  and  search/retrieval  tools  should  significantly  facilitate  remote 
access  to  the  world’s  largest  archive  of  animal  sounds.  Urgent  demands  by  television  and  radio 
programs  for  exemplars  of  a  particular  species’  sounds  can  now  be  filled  quickly  and  even  remotely  by 
the  programming  staff.  Similarly,  public  institutions  such  as  zoos  or  museums  that  would  like  to 
compile  a  set  of  sounds  demonstrating  specific  principles  can  now  browse  the  collection  remotely  and 
select  the  material  they  need.  There  is  an  increasingly  large  industry  creating  nature-oriented  products 
and  many  of  these  companies  come  to  the  Macaulay  Library  for  authenticated  rich  media  resources. 
The  tools  provide  them  with  a  way  to  both  ensure  accuracy  and  find  sounds  in  the  archive  that  they 
feel  best  meet  their  criteria. 

Quality  of  Life 


The  Macaulay  Library  has  a  long  tradition  of  supplying  reference  sounds  to  wildlife  management 
programs  worldwide.  Acoustic  censusing  has  become  a  primary  means  for  assessing  biological 
community  health  in  both  forested  and  marine  environments.  Access  to  reference  sounds  is  key  to  such 
programs.  Our  new  tools  will  greatly  facilitate  remote  selection  of  material  by  these  agencies. 

Science  Education  and  Communication 

Animal behavior, and in particular marine animal behavior, is a topic of natural interest to children. It
is  one  that  can  easily  be  used  as  a  springboard  into  biology  and  other  STEM  topics  including 
physiology,  economics,  mathematics,  chemistry,  and  physics.  The  extensive  rich  media  archives  of  the 
Macaulay  Library  can  be  used  to  create  those  springboards  and  enhance  STEM  teaching  at  all  levels. 
Because  the  archives  are  so  large,  it  can  be  daunting  to  teachers  and  curriculum  writers  to  attempt  to 
find  optimal  materials  for  a  given  task.  The  new  preview,  annotation,  and  search  tools  will  greatly 
facilitate  access  by  the  education  community,  and  the  Macaulay  Library  recently  hired  a  new 
“information  broker”  to  serve  as  a  guide  to  this  community  on  use  of  the  archives. 

TRANSITIONS 

Economic  Development 

The Macaulay Library resources are widely used by commercial entities, museums, zoos, aquaria,
science centers, educators, and the media. While the original focus of the visualization, annotation,
search, and retrieval tools was largely to facilitate research, we are finding that they have greatly
increased the accessibility and utility of the archive for these many other users. A number of commercial products
using  Macaulay  Library  sounds  are  currently  quite  popular  and  indirectly  promote  public  awareness  of 
nature  and  conservation.  One  book  containing  our  bird  sounds  is  among  the  top  10  products  being 
purchased  this  holiday  season. 

Quality  of  Life 

The  new  software  tools  have  greatly  enhanced  the  ability  of  conservation  programs,  marine  refuge 
managers,  and  wildlife  biologists  to  obtain  reference  sounds  that  they  need  for  their  projects  and  their 
interfacing  with  the  public.  Macaulay  Library  staff  have  participated  in  numerous  national  meetings  on 
marine  animal  conservation  (ECOUS,  NOAA),  and  are  active  collaborators  in  several  ongoing  marine 
conservation  projects. 

Science  Education  and  Communication 

As  a  twice-funded  member  of  the  National  Science  and  Mathematics  Digital  Library  (NSDL)  program, 
the  Macaulay  Library  and  the  Education  Unit  at  the  Cornell  Lab  of  Ornithology  are  very  actively 
involved  in  the  development  of  science  curricula  including  topics  involving  marine  animal  behavior 
and  ecology.  We  collaborate  with  a  number  of  marine  parks  and  research  programs  to  share 
educational materials. Perhaps as important, the general public now has free access to listen to any
sound in the Macaulay Library archive and to use the new visualization tools on it.


RELATED  PROJECTS 

This NOPP project has been complemented by concurrent grants from the Office of Naval Research
that funded the acquisition and archival of marine animal sounds (N00014-02-1-0467). The
educational  outreach  has  been  funded  by  two  NSF-NSDL  awards  (DUE-0332872  and  DUE-0532786). 
An  earlier  NSF  award  (LBN-0337507)  funded  the  design  of  the  data  model  relating  to  behavior. 